Overview
Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 18594 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.6 MiB |
| Average record size in memory | 429.0 B |
Variable types
| Text | 1 |
|---|---|
| Numeric | 9 |
| Categorical | 3 |
| DateTime | 2 |
Amount_Funded_By_Lender is highly overall correlated with Lender_portion_Funded and 3 other fields | High correlation |
Lender_portion_Funded is highly overall correlated with Amount_Funded_By_Lender and 2 other fields | High correlation |
Lender_portion_to_be_repaid is highly overall correlated with Amount_Funded_By_Lender and 3 other fields | High correlation |
Total_Amount is highly overall correlated with Amount_Funded_By_Lender and 3 other fields | High correlation |
Total_Amount_to_Repay is highly overall correlated with Amount_Funded_By_Lender and 3 other fields | High correlation |
country_id is highly overall correlated with Lender_portion_Funded and 4 other fields | High correlation |
customer_id is highly overall correlated with country_id and 2 other fields | High correlation |
duration is highly overall correlated with loan_type | High correlation |
lender_id is highly overall correlated with country_id and 2 other fields | High correlation |
loan_type is highly overall correlated with Total_Amount and 4 other fields | High correlation |
tbl_loan_id is highly overall correlated with country_id and 1 other field | High correlation |
loan_type is highly imbalanced (69.0%) | Imbalance |
New_versus_Repeat is highly imbalanced (92.5%) | Imbalance |
Total_Amount is highly skewed (γ₁ = 112.8584063) | Skewed |
Total_Amount_to_Repay is highly skewed (γ₁ = 116.5174333) | Skewed |
Amount_Funded_By_Lender is highly skewed (γ₁ = 20.79293646) | Skewed |
ID has unique values | Unique |
Amount_Funded_By_Lender has 1868 (10.0%) zeros | Zeros |
Lender_portion_Funded has 1868 (10.0%) zeros | Zeros |
Lender_portion_to_be_repaid has 1953 (10.5%) zeros | Zeros |
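The imbalance percentages in the alerts above appear consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how the figure is derived, not a quote of the profiler's code. It uses the New_versus_Repeat counts reported in this document (18425 Repeat Loan, 169 New Loan), and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Normalize by the maximum entropy for this number of classes.
    return 1.0 - entropy / math.log2(len(counts))

# Counts taken from the New_versus_Repeat "Common Values" table.
score = imbalance_score([18425, 169])
print(f"{score:.1%}")
```

Under this assumption the result rounds to the 92.5% shown in the New_versus_Repeat alert. The loan_type figure cannot be rechecked the same way because the report does not list all 22 class counts.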
Reproduction
| Analysis started | 2025-01-06 20:11:38.863057 |
|---|---|
| Analysis finished | 2025-01-06 20:11:57.341448 |
| Duration | 18.48 seconds |
| Software version | ydata-profiling v4.12.1 |
Variables
ID
Text
Unique 
| Distinct | 18594 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
Length
| Max length | 21 |
|---|---|
| Median length | 21 |
| Mean length | 20.990212 |
| Min length | 19 |
Unique
| Unique | 18594 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | ID_269404226088267278 |
|---|---|
| 2nd row | ID_255356300042267278 |
| 3rd row | ID_257026243764267278 |
| 4th row | ID_264617299409267278 |
| 5th row | ID_247613296713267278 |
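Comparing these sample IDs with the sample rows at the end of the report suggests that each ID is the prefix `ID_` followed by customer_id, tbl_loan_id, and lender_id concatenated as decimal strings. This is an inference from the displayed rows, not something the report states, and `build_id` is a hypothetical helper.

```python
def build_id(customer_id: int, tbl_loan_id: int, lender_id: int) -> str:
    """Concatenate the three keys behind an 'ID_' prefix (assumed layout)."""
    return f"ID_{customer_id}{tbl_loan_id}{lender_id}"

# (customer_id, tbl_loan_id, lender_id, ID) copied from the report's sample rows.
rows = [
    (269404, 226088, 267278, "ID_269404226088267278"),
    (255356, 300042, 267278, "ID_255356300042267278"),
    (297596, 365331, 297183, "ID_297596365331297183"),
]

# Compare the rebuilt ID with the ID shown in the report for each row.
matches = [build_id(c, loan, lender) == expected for c, loan, lender, expected in rows]
print(matches)
```

If the layout holds, three six-digit keys give the maximum length of 21, and a four-digit customer_id (the reported minimum is 6083) would give the minimum length of 19.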
Words
| Value | Count | Frequency (%) |
| id_269404226088267278 | 1 | < 0.1% |
| id_250225217173267278 | 1 | < 0.1% |
| id_271847294122267278 | 1 | < 0.1% |
| id_308399367770267278 | 1 | < 0.1% |
| id_253278278418267278 | 1 | < 0.1% |
| id_260080290274267278 | 1 | < 0.1% |
| id_256877248892267278 | 1 | < 0.1% |
| id_297079364851297183 | 1 | < 0.1% |
| id_260008268836267278 | 1 | < 0.1% |
| id_253042249843267278 | 1 | < 0.1% |
| Other values (18584) | 18584 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 81767 | 21.0% |
| 7 | 47571 | 12.2% |
| 6 | 41266 | 10.6% |
| 8 | 32878 | 8.4% |
| 5 | 24933 | 6.4% |
| 4 | 24416 | 6.3% |
| 3 | 23770 | 6.1% |
| 9 | 23029 | 5.9% |
| I | 18594 | 4.8% |
| D | 18594 | 4.8% |
| Other values (3) | 53474 | 13.7% |
customer_id
Real number (ℝ)
High correlation 
| Distinct | 4962 |
|---|---|
| Distinct (%) | 26.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 262489.51 |
| Minimum | 6083 |
| Maximum | 312696 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 6083 |
|---|---|
| 5-th percentile | 241751 |
| Q1 | 250357 |
| median | 259107 |
| Q3 | 270051.25 |
| 95-th percentile | 297741 |
| Maximum | 312696 |
| Range | 306613 |
| Interquartile range (IQR) | 19694.25 |
Descriptive statistics
| Standard deviation | 28957.313 |
|---|---|
| Coefficient of variation (CV) | 0.11031798 |
| Kurtosis | 28.539836 |
| Mean | 262489.51 |
| Median Absolute Deviation (MAD) | 9524 |
| Skewness | -3.6393518 |
| Sum | 4.8807299 × 10⁹ |
| Variance | 8.3852597 × 10⁸ |
| Monotonicity | Not monotonic |
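The derived statistics in the tables above follow from the basic ones by standard definitions: range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared. A minimal check using the customer_id figures as reported; the variable names are illustrative only.

```python
import math

# Values copied from the customer_id tables in this report.
minimum, maximum = 6083, 312696
q1, q3 = 250357, 270051.25
mean, std = 262489.51, 28957.313

value_range = maximum - minimum   # reported: 306613
iqr = q3 - q1                     # reported: 19694.25
cv = std / mean                   # reported: 0.11031798
variance = std ** 2               # reported: 8.3852597e8

# Compare each derived value with the figure printed in the report,
# allowing for the rounding of the displayed inputs.
checks = {
    "range": value_range == 306613,
    "iqr": math.isclose(iqr, 19694.25),
    "cv": math.isclose(cv, 0.11031798, rel_tol=1e-6),
    "variance": math.isclose(variance, 8.3852597e8, rel_tol=1e-6),
}
print(checks)
```

The same identities can be applied to the other numeric variables as a quick sanity check on the extracted figures.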
| Value | Count | Frequency (%) |
| 296718 | 60 | 0.3% |
| 296758 | 55 | 0.3% |
| 296992 | 47 | 0.3% |
| 297596 | 46 | 0.2% |
| 296562 | 46 | 0.2% |
| 296745 | 45 | 0.2% |
| 297741 | 44 | 0.2% |
| 247613 | 44 | 0.2% |
| 296998 | 43 | 0.2% |
| 296287 | 43 | 0.2% |
| Other values (4952) | 18121 |
| Value | Count | Frequency (%) |
| 6083 | 2 | < 0.1% |
| 7154 | 4 | < 0.1% |
| 7411 | 12 | |
| 7651 | 5 | < 0.1% |
| 7907 | 1 | < 0.1% |
| 8256 | 1 | < 0.1% |
| 8454 | 19 | |
| 12897 | 18 | |
| 14932 | 1 | < 0.1% |
| 22710 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 312696 | 1 | < 0.1% |
| 312654 | 2 | < 0.1% |
| 312651 | 2 | < 0.1% |
| 312608 | 1 | < 0.1% |
| 312432 | 2 | < 0.1% |
| 312384 | 1 | < 0.1% |
| 312241 | 1 | < 0.1% |
| 312159 | 5 | |
| 312026 | 1 | < 0.1% |
| 311981 | 1 | < 0.1% |
country_id
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.7 KiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Kenya |
|---|---|
| 2nd row | Kenya |
| 3rd row | Kenya |
| 4th row | Kenya |
| 5th row | Kenya |
Common Values
| Value | Count | Frequency (%) |
| Kenya | 15069 | 81.0% |
| Ghana | 3525 | 19.0% |
Words
| Value | Count | Frequency (%) |
| kenya | 15069 | 81.0% |
| ghana | 3525 | 19.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 22119 | 23.8% |
| n | 18594 | 20.0% |
| K | 15069 | 16.2% |
| e | 15069 | 16.2% |
| y | 15069 | 16.2% |
| G | 3525 | 3.8% |
| h | 3525 | 3.8% |
tbl_loan_id
Real number (ℝ)
High correlation 
| Distinct | 17067 |
|---|---|
| Distinct (%) | 91.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 282416.63 |
| Minimum | 104034 |
| Maximum | 375320 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 104034 |
|---|---|
| 5-th percentile | 217958.6 |
| Q1 | 240880.5 |
| median | 273442.5 |
| Q3 | 304856 |
| 95-th percentile | 365207.35 |
| Maximum | 375320 |
| Range | 271286 |
| Interquartile range (IQR) | 63975.5 |
Descriptive statistics
| Standard deviation | 52907.549 |
|---|---|
| Coefficient of variation (CV) | 0.18733864 |
| Kurtosis | -0.53525991 |
| Mean | 282416.63 |
| Median Absolute Deviation (MAD) | 31915 |
| Skewness | 0.31085808 |
| Sum | 5.2512549 × 10⁹ |
| Variance | 2.7992087 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 364300 | 3 | < 0.1% |
| 364043 | 3 | < 0.1% |
| 364217 | 3 | < 0.1% |
| 364013 | 3 | < 0.1% |
| 363955 | 3 | < 0.1% |
| 364044 | 3 | < 0.1% |
| 364014 | 3 | < 0.1% |
| 364301 | 3 | < 0.1% |
| 364358 | 3 | < 0.1% |
| 364297 | 3 | < 0.1% |
| Other values (17057) | 18564 |
| Value | Count | Frequency (%) |
| 104034 | 1 | |
| 104601 | 1 | |
| 104603 | 1 | |
| 105198 | 1 | |
| 105348 | 1 | |
| 105353 | 1 | |
| 105883 | 1 | |
| 105966 | 1 | |
| 106386 | 1 | |
| 110034 | 1 |
| Value | Count | Frequency (%) |
| 375320 | 1 | |
| 375315 | 1 | |
| 375306 | 1 | |
| 375304 | 2 | |
| 375289 | 1 | |
| 375288 | 2 | |
| 375286 | 1 | |
| 375280 | 1 | |
| 375277 | 1 | |
| 375275 | 1 |
lender_id
Real number (ℝ)
High correlation 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 271876.75 |
| Minimum | 245684 |
| Maximum | 297183 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 245684 |
|---|---|
| 5-th percentile | 267277 |
| Q1 | 267278 |
| median | 267278 |
| Q3 | 267278 |
| 95-th percentile | 297183 |
| Maximum | 297183 |
| Range | 51499 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12349.646 |
|---|---|
| Coefficient of variation (CV) | 0.045423693 |
| Kurtosis | 0.5212782 |
| Mean | 271876.75 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.2138236 |
| Sum | 5.0552763 × 10⁹ |
| Variance | 1.5251376 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 267278 | 14221 | 76.5% |
| 296542 | 1803 | 9.7% |
| 297183 | 1264 | 6.8% |
| 251804 | 761 | 4.1% |
| 296540 | 179 | 1.0% |
| 297182 | 163 | 0.9% |
| 245684 | 157 | 0.8% |
| 267277 | 46 | 0.2% |
| Value | Count | Frequency (%) |
| 245684 | 157 | 0.8% |
| 251804 | 761 | 4.1% |
| 267277 | 46 | 0.2% |
| 267278 | 14221 | 76.5% |
| 296540 | 179 | 1.0% |
| 296542 | 1803 | 9.7% |
| 297182 | 163 | 0.9% |
| 297183 | 1264 | 6.8% |
| Value | Count | Frequency (%) |
| 297183 | 1264 | 6.8% |
| 297182 | 163 | 0.9% |
| 296542 | 1803 | 9.7% |
| 296540 | 179 | 1.0% |
| 267278 | 14221 | 76.5% |
| 267277 | 46 | 0.2% |
| 251804 | 761 | 4.1% |
| 245684 | 157 | 0.8% |
loan_type
Categorical
High correlation  Imbalance 
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 999.0 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.0087663 |
| Min length | 6 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Type_1 |
|---|---|
| 2nd row | Type_1 |
| 3rd row | Type_1 |
| 4th row | Type_1 |
| 5th row | Type_1 |
Common Values
| Value | Count | Frequency (%) |
| Type_1 | 13618 | 73.2% |
| Type_3 | 3039 | 16.3% |
| Type_7 | 592 | 3.2% |
| Type_2 | 454 | 2.4% |
| Type_5 | 298 | 1.6% |
| Type_4 | 253 | 1.4% |
| Type_6 | 98 | 0.5% |
| Type_10 | 96 | 0.5% |
| Type_9 | 42 | 0.2% |
| Type_8 | 37 | 0.2% |
| Other values (12) | 67 | 0.4% |
Words
| Value | Count | Frequency (%) |
| type_1 | 13618 | 73.2% |
| type_3 | 3039 | 16.3% |
| type_7 | 592 | 3.2% |
| type_2 | 454 | 2.4% |
| type_5 | 298 | 1.6% |
| type_4 | 253 | 1.4% |
| type_6 | 98 | 0.5% |
| type_10 | 96 | 0.5% |
| type_9 | 42 | 0.2% |
| type_8 | 37 | 0.2% |
| Other values (12) | 67 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| T | 18594 | 16.6% |
| y | 18594 | 16.6% |
| p | 18594 | 16.6% |
| e | 18594 | 16.6% |
| _ | 18594 | 16.6% |
| 1 | 13783 | 12.3% |
| 3 | 3047 | 2.7% |
| 7 | 595 | 0.5% |
| 2 | 473 | 0.4% |
| 5 | 299 | 0.3% |
| Other values (5) | 560 | 0.5% |
Total_Amount
Real number (ℝ)
High correlation  Skewed 
| Distinct | 9372 |
|---|---|
| Distinct (%) | 50.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14465.074 |
| Minimum | 5 |
| Maximum | 20000000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 719 |
| Q1 | 2101.9 |
| median | 4740 |
| Q3 | 10267.75 |
| 95-th percentile | 50000 |
| Maximum | 20000000 |
| Range | 19999995 |
| Interquartile range (IQR) | 8165.85 |
Descriptive statistics
| Standard deviation | 156908.47 |
|---|---|
| Coefficient of variation (CV) | 10.847402 |
| Kurtosis | 14191.573 |
| Mean | 14465.074 |
| Median Absolute Deviation (MAD) | 3169 |
| Skewness | 112.85841 |
| Sum | 2.6896358 × 10⁸ |
| Variance | 2.4620266 × 10¹⁰ |
| Monotonicity | Not monotonic |
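The skewness alert for Total_Amount reflects a long right tail: the median is 4740 while the maximum is 20000000. The sketch below computes the moment-based skewness γ₁ = m₃ / m₂^(3/2) on a small made-up sample to illustrate how a single extreme value inflates the statistic. The data are illustrative and not drawn from the dataset, and the profiler may apply a bias-corrected estimator, so its exact figures would differ.

```python
def skewness(values):
    """Population (moment-based) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical loan amounts, symmetric around 5000.
typical = [3000, 4000, 5000, 6000, 7000]
# The same sample plus one extreme outlier.
with_outlier = typical + [20_000_000]

print(skewness(typical))       # symmetric sample, so the skewness is zero
print(skewness(with_outlier))  # strongly positive
```

With 18594 observations instead of six, one value of this size can push the statistic far higher, which fits the very large kurtosis reported above.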
| Value | Count | Frequency (%) |
| 1500 | 183 | 1.0% |
| 5000 | 148 | 0.8% |
| 10000 | 66 | 0.4% |
| 2199 | 56 | 0.3% |
| 4699 | 50 | 0.3% |
| 2250 | 45 | 0.2% |
| 2000 | 45 | 0.2% |
| 2649 | 43 | 0.2% |
| 6000 | 43 | 0.2% |
| 6499 | 36 | 0.2% |
| Other values (9362) | 17879 |
| Value | Count | Frequency (%) |
| 5 | 2 | < 0.1% |
| 10 | 5 | |
| 27 | 1 | < 0.1% |
| 30 | 1 | < 0.1% |
| 50 | 2 | < 0.1% |
| 70 | 1 | < 0.1% |
| 100 | 2 | < 0.1% |
| 109 | 1 | < 0.1% |
| 110 | 1 | < 0.1% |
| 112 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 20000000 | 1 | < 0.1% |
| 3986325 | 1 | < 0.1% |
| 3006566 | 1 | < 0.1% |
| 2263187 | 2 | < 0.1% |
| 1100000 | 1 | < 0.1% |
| 1000000 | 1 | < 0.1% |
| 837817.2 | 1 | < 0.1% |
| 810119.1 | 1 | < 0.1% |
| 800000 | 5 | |
| 790609 | 1 | < 0.1% |
Total_Amount_to_Repay
Real number (ℝ)
High correlation  Skewed 
| Distinct | 10963 |
|---|---|
| Distinct (%) | 59.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15784.161 |
| Minimum | 0 |
| Maximum | 24152842 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 730 |
| Q1 | 2164.48 |
| median | 4828 |
| Q3 | 10567.573 |
| 95-th percentile | 52990 |
| Maximum | 24152842 |
| Range | 24152842 |
| Interquartile range (IQR) | 8403.0925 |
Descriptive statistics
| Standard deviation | 187189.29 |
|---|---|
| Coefficient of variation (CV) | 11.859312 |
| Kurtosis | 14891.287 |
| Mean | 15784.161 |
| Median Absolute Deviation (MAD) | 3210 |
| Skewness | 116.51743 |
| Sum | 2.9349069 × 10⁸ |
| Variance | 3.5039829 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5176 | 98 | 0.5% |
| 1555 | 54 | 0.3% |
| 1500 | 40 | 0.2% |
| 2199 | 37 | 0.2% |
| 4699 | 33 | 0.2% |
| 2249 | 25 | 0.1% |
| 10700 | 24 | 0.1% |
| 2649 | 23 | 0.1% |
| 6211 | 22 | 0.1% |
| 2250 | 21 | 0.1% |
| Other values (10953) | 18217 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1.19 | 2 | |
| 6 | 2 | |
| 10.15 | 1 | < 0.1% |
| 10.7 | 1 | < 0.1% |
| 11 | 3 | |
| 30.45 | 1 | < 0.1% |
| 33.14 | 1 | < 0.1% |
| 52 | 1 | < 0.1% |
| 70 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 24152842 | 1 | |
| 4205572.88 | 1 | |
| 3167417.28 | 1 | |
| 2395583.44 | 2 | |
| 1240486 | 1 | |
| 1092000 | 1 | |
| 896050 | 2 | |
| 894873 | 1 | |
| 855316 | 2 | |
| 850384.46 | 1 |
disbursement_date
Date
| Distinct | 656 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 145.4 KiB |
| Minimum | 2021-11-08 00:00:00 |
| Maximum | 2024-11-14 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
due_date
Date
| Distinct | 728 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 145.4 KiB |
| Minimum | 2021-11-15 00:00:00 |
| Maximum | 2025-01-16 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
duration
Real number (ℝ)
High correlation 
| Distinct | 50 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.530763 |
| Minimum | 1 |
| Maximum | 849 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 14 |
| Maximum | 849 |
| Range | 848 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 36.437325 |
|---|---|
| Coefficient of variation (CV) | 2.6929247 |
| Kurtosis | 60.061692 |
| Mean | 13.530763 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.0122942 |
| Sum | 251591 |
| Variance | 1327.6786 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7 | 17365 | 93.4% |
| 14 | 311 | 1.7% |
| 180 | 290 | 1.6% |
| 30 | 211 | 1.1% |
| 90 | 56 | 0.3% |
| 365 | 51 | 0.3% |
| 240 | 39 | 0.2% |
| 300 | 37 | 0.2% |
| 60 | 33 | 0.2% |
| 210 | 26 | 0.1% |
| Other values (40) | 175 | 0.9% |
| Value | Count | Frequency (%) |
| 1 | 5 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 8 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 2 | < 0.1% |
| 7 | 17365 | 93.4% |
| 8 | 1 | < 0.1% |
| 14 | 311 | 1.7% |
| 15 | 2 | < 0.1% |
| 20 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 849 | 1 | < 0.1% |
| 365 | 51 | |
| 360 | 9 | < 0.1% |
| 330 | 1 | < 0.1% |
| 300 | 37 | |
| 270 | 5 | < 0.1% |
| 265 | 3 | < 0.1% |
| 240 | 39 | |
| 210 | 26 | |
| 183 | 2 | < 0.1% |
New_versus_Repeat
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.972733 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Repeat Loan |
|---|---|
| 2nd row | Repeat Loan |
| 3rd row | Repeat Loan |
| 4th row | Repeat Loan |
| 5th row | Repeat Loan |
Common Values
| Value | Count | Frequency (%) |
| Repeat Loan | 18425 | 99.1% |
| New Loan | 169 | 0.9% |
Words
| Value | Count | Frequency (%) |
| loan | 18594 | 50.0% |
| repeat | 18425 | 49.5% |
| new | 169 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 37019 | 18.1% |
| a | 37019 | 18.1% |
| (space) | 18594 | 9.1% |
| L | 18594 | 9.1% |
| o | 18594 | 9.1% |
| n | 18594 | 9.1% |
| R | 18425 | 9.0% |
| p | 18425 | 9.0% |
| t | 18425 | 9.0% |
| N | 169 | 0.1% |
Amount_Funded_By_Lender
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 9704 |
|---|---|
| Distinct (%) | 52.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2278.4301 |
| Minimum | 0 |
| Maximum | 400000 |
| Zeros | 1868 |
| Zeros (%) | 10.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 239.36 |
| median | 744.575 |
| Q3 | 1998 |
| 95-th percentile | 9147.525 |
| Maximum | 400000 |
| Range | 400000 |
| Interquartile range (IQR) | 1758.64 |
Descriptive statistics
| Standard deviation | 6784.4298 |
|---|---|
| Coefficient of variation (CV) | 2.9776773 |
| Kurtosis | 848.23058 |
| Mean | 2278.4301 |
| Median Absolute Deviation (MAD) | 639.435 |
| Skewness | 20.792936 |
| Sum | 42365130 |
| Variance | 46028487 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1868 | 10.0% |
| 450 | 145 | 0.8% |
| 1000 | 115 | 0.6% |
| 1200 | 50 | 0.3% |
| 659.7 | 43 | 0.2% |
| 600 | 38 | 0.2% |
| 1574.7 | 34 | 0.2% |
| 1496.7 | 32 | 0.2% |
| 10000 | 31 | 0.2% |
| 179.7 | 29 | 0.2% |
| Other values (9694) | 16209 |
| Value | Count | Frequency (%) |
| 0 | 1868 | |
| 0.01 | 11 | 0.1% |
| 0.02 | 6 | < 0.1% |
| 0.03 | 2 | < 0.1% |
| 0.04 | 4 | < 0.1% |
| 0.05 | 3 | < 0.1% |
| 0.06 | 4 | < 0.1% |
| 0.07 | 4 | < 0.1% |
| 0.08 | 1 | < 0.1% |
| 0.09 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 400000 | 1 | < 0.1% |
| 216666 | 1 | < 0.1% |
| 200000 | 2 | |
| 190000 | 1 | < 0.1% |
| 140000 | 2 | |
| 119798.1 | 1 | < 0.1% |
| 100000 | 3 | |
| 98060.7 | 1 | < 0.1% |
| 97606 | 1 | < 0.1% |
| 93750 | 1 | < 0.1% |
Lender_portion_Funded
Real number (ℝ)
High correlation  Zeros 
| Distinct | 3880 |
|---|---|
| Distinct (%) | 20.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.20708981 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 1868 |
| Zeros (%) | 10.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.13131313 |
| median | 0.3 |
| Q3 | 0.3 |
| 95-th percentile | 0.3 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.16868687 |
Descriptive statistics
| Standard deviation | 0.12208543 |
|---|---|
| Coefficient of variation (CV) | 0.58952889 |
| Kurtosis | 2.6668197 |
| Mean | 0.20708981 |
| Median Absolute Deviation (MAD) | 0.016608123 |
| Skewness | 0.071617548 |
| Sum | 3850.628 |
| Variance | 0.014904852 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.3 | 9232 | |
| 0 | 1868 | 10.0% |
| 0.2 | 721 | 3.9% |
| 0.16 | 457 | 2.5% |
| 0.1333333333 | 150 | 0.8% |
| 0.13 | 150 | 0.8% |
| 0.5 | 74 | 0.4% |
| 0.1578947368 | 66 | 0.4% |
| 0.1588785047 | 61 | 0.3% |
| 0.1308411215 | 58 | 0.3% |
| Other values (3870) | 5757 |
| Value | Count | Frequency (%) |
| 0 | 1868 | |
| 2.589533107 × 10⁻⁷ | 1 | < 0.1% |
| 4.528165187 × 10⁻⁷ | 1 | < 0.1% |
| 9.04077389 × 10⁻⁷ | 1 | < 0.1% |
| 1.036537963 × 10⁻⁶ | 1 | < 0.1% |
| 1.042807237 × 10⁻⁶ | 1 | < 0.1% |
| 1.294665976 × 10⁻⁶ | 1 | < 0.1% |
| 1.364628821 × 10⁻⁶ | 1 | < 0.1% |
| 1.724435247 × 10⁻⁶ | 1 | < 0.1% |
| 1.750087504 × 10⁻⁶ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 32 | |
| 0.9993333333 | 1 | < 0.1% |
| 0.9988766065 | 1 | < 0.1% |
| 0.9958333333 | 1 | < 0.1% |
| 0.95 | 1 | < 0.1% |
| 0.9415521064 | 1 | < 0.1% |
| 0.9090909091 | 1 | < 0.1% |
| 0.8911067546 | 1 | < 0.1% |
| 0.8 | 9 | < 0.1% |
| 0.75 | 1 | < 0.1% |
Lender_portion_to_be_repaid
Real number (ℝ)
High correlation  Zeros 
| Distinct | 6782 |
|---|---|
| Distinct (%) | 36.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2466.452 |
| Minimum | 0 |
| Maximum | 423400 |
| Zeros | 1953 |
| Zeros (%) | 10.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 145.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 244.035 |
| median | 758.92 |
| Q3 | 2041 |
| 95-th percentile | 10146.734 |
| Maximum | 423400 |
| Range | 423400 |
| Interquartile range (IQR) | 1796.965 |
Descriptive statistics
| Standard deviation | 7680.0818 |
|---|---|
| Coefficient of variation (CV) | 3.1138177 |
| Kurtosis | 694.62888 |
| Mean | 2466.452 |
| Median Absolute Deviation (MAD) | 651.08 |
| Skewness | 19.076854 |
| Sum | 45861208 |
| Variance | 58983657 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1953 | 10.5% |
| 1035 | 95 | 0.5% |
| 1 | 64 | 0.3% |
| 2 | 46 | 0.2% |
| 467 | 43 | 0.2% |
| 4 | 42 | 0.2% |
| 450 | 42 | 0.2% |
| 675 | 34 | 0.2% |
| 660 | 34 | 0.2% |
| 3 | 32 | 0.2% |
| Other values (6772) | 16209 |
| Value | Count | Frequency (%) |
| 0 | 1953 | |
| 1 | 64 | 0.3% |
| 2 | 46 | 0.2% |
| 3 | 32 | 0.2% |
| 4 | 42 | 0.2% |
| 5 | 25 | 0.1% |
| 6 | 19 | 0.1% |
| 7 | 14 | 0.1% |
| 8 | 12 | 0.1% |
| 9 | 15 | 0.1% |
| Value | Count | Frequency (%) |
| 423400 | 1 | |
| 236599 | 1 | |
| 223718 | 1 | |
| 216259 | 1 | |
| 210700 | 1 | |
| 184489.07 | 1 | |
| 152900 | 2 | |
| 125027.39 | 1 | |
| 123794.19 | 1 | |
| 107503 | 1 |
Interactions
Correlations
| | Amount_Funded_By_Lender | Lender_portion_Funded | Lender_portion_to_be_repaid | New_versus_Repeat | Total_Amount | Total_Amount_to_Repay | country_id | customer_id | duration | lender_id | loan_type | tbl_loan_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Amount_Funded_By_Lender | 1.000 | 0.567 | 0.999 | 0.000 | 0.644 | 0.644 | 0.018 | -0.086 | 0.331 | -0.201 | 0.246 | -0.157 |
| Lender_portion_Funded | 0.567 | 1.000 | 0.563 | 0.232 | -0.078 | -0.079 | 0.717 | -0.232 | -0.105 | -0.222 | 0.485 | -0.378 |
| Lender_portion_to_be_repaid | 0.999 | 0.563 | 1.000 | 0.009 | 0.645 | 0.646 | 0.039 | -0.081 | 0.340 | -0.199 | 0.261 | -0.152 |
| New_versus_Repeat | 0.000 | 0.232 | 0.009 | 1.000 | 0.000 | 0.000 | 0.046 | 0.096 | 0.093 | 0.152 | 0.319 | 0.145 |
| Total_Amount | 0.644 | -0.078 | 0.645 | 0.000 | 1.000 | 0.999 | 0.000 | -0.120 | 0.359 | -0.241 | 0.912 | -0.091 |
| Total_Amount_to_Repay | 0.644 | -0.079 | 0.646 | 0.000 | 0.999 | 1.000 | 0.000 | -0.115 | 0.363 | -0.240 | 0.816 | -0.086 |
| country_id | 0.018 | 0.717 | 0.039 | 0.046 | 0.000 | 0.000 | 1.000 | 0.913 | 0.326 | 0.995 | 0.999 | 0.884 |
| customer_id | -0.086 | -0.232 | -0.081 | 0.096 | -0.120 | -0.115 | 0.913 | 1.000 | 0.069 | 0.503 | 0.344 | 0.553 |
| duration | 0.331 | -0.105 | 0.340 | 0.093 | 0.359 | 0.363 | 0.326 | 0.069 | 1.000 | -0.144 | 0.666 | 0.111 |
| lender_id | -0.201 | -0.222 | -0.199 | 0.152 | -0.241 | -0.240 | 0.995 | 0.503 | -0.144 | 1.000 | 0.801 | 0.483 |
| loan_type | 0.246 | 0.485 | 0.261 | 0.319 | 0.912 | 0.816 | 0.999 | 0.344 | 0.666 | 0.801 | 1.000 | 0.456 |
| tbl_loan_id | -0.157 | -0.378 | -0.152 | 0.145 | -0.091 | -0.086 | 0.884 | 0.553 | 0.111 | 0.483 | 0.456 | 1.000 |
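The High correlation alerts near the top of the report are consistent with flagging every pair whose absolute coefficient in this matrix is at least 0.5. That threshold is an assumption inferred from the numbers, not a setting quoted from the report. The sketch below applies it to two rows copied from the matrix; `flag_high` is a hypothetical helper.

```python
THRESHOLD = 0.5  # assumed cut-off, inferred from the alerts

# Two rows copied from the correlation matrix (self-correlation omitted).
matrix = {
    "duration": {
        "Amount_Funded_By_Lender": 0.331, "Lender_portion_Funded": -0.105,
        "Lender_portion_to_be_repaid": 0.340, "New_versus_Repeat": 0.093,
        "Total_Amount": 0.359, "Total_Amount_to_Repay": 0.363,
        "country_id": 0.326, "customer_id": 0.069, "lender_id": -0.144,
        "loan_type": 0.666, "tbl_loan_id": 0.111,
    },
    "customer_id": {
        "Amount_Funded_By_Lender": -0.086, "Lender_portion_Funded": -0.232,
        "Lender_portion_to_be_repaid": -0.081, "New_versus_Repeat": 0.096,
        "Total_Amount": -0.120, "Total_Amount_to_Repay": -0.115,
        "country_id": 0.913, "duration": 0.069, "lender_id": 0.503,
        "loan_type": 0.344, "tbl_loan_id": 0.553,
    },
}

def flag_high(row, threshold=THRESHOLD):
    """Return the names whose absolute correlation meets the threshold."""
    return sorted(name for name, r in row.items() if abs(r) >= threshold)

for variable, row in matrix.items():
    print(variable, "->", flag_high(row))
```

Under this threshold, duration is paired only with loan_type, and customer_id with country_id plus two other fields, which agrees with the corresponding alerts.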
Missing values
Sample
| | ID | customer_id | country_id | tbl_loan_id | lender_id | loan_type | Total_Amount | Total_Amount_to_Repay | disbursement_date | due_date | duration | New_versus_Repeat | Amount_Funded_By_Lender | Lender_portion_Funded | Lender_portion_to_be_repaid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ID_269404226088267278 | 269404 | Kenya | 226088 | 267278 | Type_1 | 1919.0 | 1989.0 | 2022-07-27 | 2022-08-03 | 7 | Repeat Loan | 575.7 | 0.300000 | 597.0 |
| 1 | ID_255356300042267278 | 255356 | Kenya | 300042 | 267278 | Type_1 | 2138.0 | 2153.0 | 2022-11-16 | 2022-11-23 | 7 | Repeat Loan | 0.0 | 0.000000 | 0.0 |
| 2 | ID_257026243764267278 | 257026 | Kenya | 243764 | 267278 | Type_1 | 8254.0 | 8304.0 | 2022-08-24 | 2022-08-31 | 7 | Repeat Loan | 207.0 | 0.025079 | 208.0 |
| 3 | ID_264617299409267278 | 264617 | Kenya | 299409 | 267278 | Type_1 | 3379.0 | 3379.0 | 2022-11-15 | 2022-11-22 | 7 | Repeat Loan | 1013.7 | 0.300000 | 1014.0 |
| 4 | ID_247613296713267278 | 247613 | Kenya | 296713 | 267278 | Type_1 | 120.0 | 120.0 | 2022-11-10 | 2022-11-17 | 7 | Repeat Loan | 36.0 | 0.300000 | 36.0 |
| 5 | ID_271847294122267278 | 271847 | Kenya | 294122 | 267278 | Type_1 | 3438.0 | 3471.0 | 2022-11-05 | 2022-11-12 | 7 | Repeat Loan | 1031.4 | 0.300000 | 1041.0 |
| 6 | ID_308399367770267278 | 308399 | Kenya | 367770 | 267278 | Type_7 | 5000.0 | 5181.0 | 2024-07-17 | 2024-07-24 | 7 | Repeat Loan | 1000.0 | 0.200000 | 1036.0 |
| 7 | ID_253278278418267278 | 253278 | Kenya | 278418 | 267278 | Type_1 | 3917.0 | 3917.0 | 2022-10-10 | 2022-10-17 | 7 | Repeat Loan | 117.0 | 0.029870 | 117.0 |
| 8 | ID_256877248892267278 | 256877 | Kenya | 248892 | 267278 | Type_1 | 4799.0 | 4799.0 | 2022-08-31 | 2022-09-07 | 7 | Repeat Loan | 0.0 | 0.000000 | 0.0 |
| 9 | ID_262156246268267278 | 262156 | Kenya | 246268 | 267278 | Type_1 | 5708.0 | 5885.0 | 2022-08-27 | 2022-09-03 | 7 | Repeat Loan | 120.0 | 0.021023 | 124.0 |
| | ID | customer_id | country_id | tbl_loan_id | lender_id | loan_type | Total_Amount | Total_Amount_to_Repay | disbursement_date | due_date | duration | New_versus_Repeat | Amount_Funded_By_Lender | Lender_portion_Funded | Lender_portion_to_be_repaid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18584 | ID_269603285181267278 | 269603 | Kenya | 285181 | 267278 | Type_1 | 3139.00 | 3181.00 | 2022-10-20 | 2022-10-27 | 7 | Repeat Loan | 0.00 | 0.000000 | 0.00 |
| 18585 | ID_268168247705267278 | 268168 | Kenya | 247705 | 267278 | Type_1 | 9539.00 | 9765.00 | 2022-08-30 | 2022-09-06 | 7 | Repeat Loan | 0.00 | 0.000000 | 0.00 |
| 18586 | ID_251462269485267278 | 251462 | Kenya | 269485 | 267278 | Type_1 | 4399.00 | 4399.00 | 2022-09-28 | 2022-10-05 | 7 | Repeat Loan | 124.05 | 0.028200 | 124.00 |
| 18587 | ID_260626298889267278 | 260626 | Kenya | 298889 | 267278 | Type_1 | 4648.00 | 4681.00 | 2022-11-14 | 2022-11-21 | 7 | Repeat Loan | 1394.40 | 0.300000 | 1404.00 |
| 18588 | ID_246989262183267278 | 246989 | Kenya | 262183 | 267278 | Type_5 | 50000.00 | 52600.00 | 2022-09-19 | 2022-10-03 | 14 | Repeat Loan | 8000.00 | 0.160000 | 8416.00 |
| 18589 | ID_297596365331297183 | 297596 | Ghana | 365331 | 297183 | Type_3 | 1730.41 | 1782.32 | 2023-02-09 | 2023-02-16 | 7 | Repeat Loan | 269.41 | 0.155689 | 279.77 |
| 18590 | ID_259715231897267278 | 259715 | Kenya | 231897 | 267278 | Type_1 | 1534.00 | 1534.00 | 2022-08-04 | 2022-08-11 | 7 | Repeat Loan | 460.20 | 0.300000 | 460.00 |
| 18591 | ID_296701364008297183 | 296701 | Ghana | 364008 | 297183 | Type_3 | 1372.21 | 1413.30 | 2022-06-23 | 2022-06-30 | 7 | Repeat Loan | 178.67 | 0.130208 | 178.67 |
| 18592 | ID_268271242864267278 | 268271 | Kenya | 242864 | 267278 | Type_1 | 5608.00 | 5781.00 | 2022-08-23 | 2022-08-30 | 7 | Repeat Loan | 0.00 | 0.000000 | 0.00 |
| 18593 | ID_248929241821267278 | 248929 | Kenya | 241821 | 267278 | Type_1 | 4038.00 | 4038.00 | 2022-08-22 | 2022-08-29 | 7 | Repeat Loan | 0.00 | 0.000000 | 0.00 |
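The sample rows suggest two relationships that would explain several of the correlation alerts: duration looks like the number of days between disbursement_date and due_date, and Lender_portion_Funded looks like Amount_Funded_By_Lender divided by Total_Amount. Both are inferences from the rows shown, not definitions given in the report. A quick check on a few of those rows, with `row_is_consistent` as a hypothetical helper:

```python
from datetime import date
import math

# (Total_Amount, disbursement_date, due_date, duration,
#  Amount_Funded_By_Lender, Lender_portion_Funded) copied from the sample rows.
rows = [
    (1919.0, date(2022, 7, 27), date(2022, 8, 3), 7, 575.7, 0.300000),
    (8254.0, date(2022, 8, 24), date(2022, 8, 31), 7, 207.0, 0.025079),
    (5000.0, date(2024, 7, 17), date(2024, 7, 24), 7, 1000.0, 0.200000),
    (50000.0, date(2022, 9, 19), date(2022, 10, 3), 14, 8000.0, 0.160000),
]

def row_is_consistent(total, disbursed, due, duration, funded, portion):
    """Check both assumed relationships for one row."""
    days_match = (due - disbursed).days == duration
    # The displayed portion is rounded to six decimals, so compare loosely.
    portion_match = math.isclose(funded / total, portion, abs_tol=1e-5)
    return days_match and portion_match

results = [row_is_consistent(*row) for row in rows]
print(results)
```

A check over the full dataset would be needed before relying on either relationship, since only a handful of rows are visible here.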